23 research outputs found

    Finding regions of aberrant DNA copy number associated with tumor phenotype

    DNA copy number alterations are a hallmark of cancer. Understanding their role in tumor progression can help improve diagnosis, prognosis and therapy selection for cancer patients. High-resolution, genome-wide measurements of DNA copy number changes for large cohorts of tumors are currently available, owing to technologies like microarray-based array comparative hybridization (arrayCGH). In this thesis, we present a computational pipeline for statistical analysis of tumor cohorts, which can help extract relevant patterns of copy number aberrations and infer their association with various phenotypical indicators. The main challenges are the instability of classification models due to the high dimensionality of the arrays compared to the small number of tumor samples, as well as the large correlations between copy number estimates measured at neighboring loci. We show that the feature ranking given by several widely-used methods for feature selection is biased due to the large correlations between features. In order to correct for the bias and instability of the feature ranking, we introduce methods for consensus segmentation of the set of arrays. We present three algorithms for consensus segmentation, which are based on identifying recurrent DNA breakpoints or DNA regions of constant copy number profile. The segmentation constitutes the basis for computing a set of super-features, corresponding to the regions. We use the super-features for supervised classification and we compare the models to baseline models trained on probe data. We validated the methods by training models for prediction of the phenotype of breast cancers and neuroblastoma tumors. We show that the multivariate segmentation affords higher model stability, in general improves prediction accuracy and facilitates model interpretation. One of our most important biological results refers to the classification of neuroblastoma tumors. 
We show that patients belonging to different age subgroups are characterized by distinct copy number patterns, with the largest discrepancy when the subgroups are defined as older or younger than 16-18 months. We thereby confirm the recommendation for a higher age cutoff than 12 months (current clinical practice) for differential diagnosis of neuroblastoma.
    Abnormal multiplicity of particular DNA segments (copy number aberrations) is one of the most prominent hallmarks of cancer. Understanding the role of this feature in tumor growth could contribute substantially to improving cancer diagnosis, prognosis and therapy, and thus help in selecting individual therapies. Microarray-based technologies such as 'Array Comparative Hybridization' (array-CGH) make it possible to produce high-resolution, genome-wide copy number maps of tumor tissue. The subject of this thesis is the development of a software pipeline for the statistical analysis of tumor cohorts, which makes it possible to derive relevant patterns of abnormal copy numbers and to associate them with various phenotypic traits. This is done with machine learning methods for classification and feature selection, with a focus on the interpretability of the learned models (regularized linear methods and decision-tree-based models). The main challenges for the methods are the high dimensionality of the data, set against a comparatively small number of measured tumor samples, and the high correlation between copy numbers measured in neighboring genomic regions. Consequently, the results of feature selection depend strongly on the choice of training data set, which severely limits reproducibility across different clinical data sets.
This thesis shows that the feature ranking determined by several common methods is strongly distorted as a result of high correlation coefficients between individual predictors. To correct this bias and the instability of the feature ranking, we introduce a dimension-reducing step into our pipeline, which consists of jointly segmenting the arrays in a multivariate fashion. We present three algorithms for this multivariate segmentation, which are based on identifying recurrent DNA breakpoints or genomic regions with constant copy number profiles. By summarizing the DNA copy number values within a region, the multivariate segmentation forms the basis for computing a smaller set of 'super-features'. Compared with classification methods that operate at the level of individual array probes, supervised classification based on the super-features improves the interpretability and the stability of the models. We validate the methods in this thesis by training prediction models on breast cancer and neuroblastoma data sets. Here we show that the multivariate segmentation step achieves increased model stability without a loss in prediction quality. The dimension of the problem is reduced considerably (up to 200-fold fewer features), which makes multivariate segmentation more than an effective means of predicting phenotypes: the procedure is also suitable as a preprocessing step for later integrative analyses with other data types. The interpretability of the models is improved as well. This enables the identification of important relationships between copy number changes and phenotype.
For example, we show that a co-amplification in the immediate vicinity of the ERBB2 gene locus is a highly informative predictor for distinguishing inflammatory from non-inflammatory breast cancers. We thereby confirm the hypothesis, common in the literature, that the size of an amplicon is related to the cancer subtype. In the case of neuroblastoma tumors, we show that subgroups defined by patient age can be characterized by copy number patterns. This is possible in particular when an age threshold of 16 to 18 months is used to define the groups, at which the prediction accuracy is also highest. Consequently, we provide further evidence for the recommendation to use a threshold higher than twelve months for the differential diagnosis of neuroblastoma.
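The abstract does not spell out the three consensus segmentation algorithms, so the following is only a minimal sketch of the general idea, under stated assumptions: breakpoints are called per sample beforehand, a breakpoint counts as "recurrent" if it appears in at least `min_support` samples (a hypothetical voting rule, not the thesis's method), and each super-feature is the mean of the probe values in a region. The function names are illustrative.

```python
from collections import Counter
from statistics import mean


def consensus_breakpoints(per_sample_breakpoints, min_support):
    """Keep probe indices reported as breakpoints in at least `min_support` samples.

    Hypothetical voting rule for illustration; not one of the thesis's algorithms.
    """
    counts = Counter(bp for bps in per_sample_breakpoints for bp in set(bps))
    return sorted(bp for bp, n in counts.items() if n >= min_support)


def super_features(profile, breakpoints):
    """Summarize one sample's probe-level values as one mean per consensus region.

    `breakpoints` must be sorted, distinct indices strictly between 0 and
    len(profile); region i covers profile[bounds[i]:bounds[i + 1]].
    """
    bounds = [0, *breakpoints, len(profile)]
    return [mean(profile[start:end]) for start, end in zip(bounds, bounds[1:])]


# Toy data: four samples with per-sample breakpoint calls over 10 probes.
calls = [[3, 7], [3], [3, 7], [5]]
regions = consensus_breakpoints(calls, min_support=2)
print(regions)  # [3, 7]

profile = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
print(super_features(profile, regions))  # [0.0, 1.0, -1.0]
```

Ten correlated probe-level features collapse to three region-level features here, which is the kind of dimension reduction the abstract attributes to the segmentation step.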

    An Analysis of Event-Agnostic Features for Rumour Classification in Twitter

    Recently, much attention has been given to models for identifying rumors in social media. Features that are helpful for automatic inference of the credibility, veracity and reliability of information have been described. The ultimate goal is to train classification models that are able to recognize future high-impact rumors as early as possible, before the event unfolds. The generalization power of the models is greatly hindered by the domain-dependent distributions of the features, an issue that has been insufficiently discussed. Here we study a large dataset consisting of rumor and non-rumor tweets commenting on nine breaking news stories taking place in different locations of the world. We found that the distributions of most features are specific to the event and that this bias naturally affects the performance of the model. The analysis of the domain-specific feature distributions is insightful and hints at the distinct characteristics of the underlying social network for different countries, social groups, cultures and others.
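One common way to probe how event-specific features affect generalization is to hold out all tweets of one event at a time. The sketch below assumes that setup; the abstract does not state the evaluation protocol used, and the tuple layout `(event, features, label)` and the function name are hypothetical.

```python
def leave_one_event_out(samples):
    """Yield (held_out_event, train, test) splits, one per event.

    `samples` is a list of (event, features, label) tuples. All samples of
    the held-out event go to `test`, so a model is always scored on an event
    it never saw during training.
    """
    events = sorted({event for event, _, _ in samples})
    for held_out in events:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


# Toy data: two events, features reduced to a single count.
tweets = [
    ("event_a", {"n_urls": 1}, 1),
    ("event_a", {"n_urls": 0}, 0),
    ("event_b", {"n_urls": 2}, 1),
]
for event, train, test in leave_one_event_out(tweets):
    print(event, len(train), len(test))
```

A large gap between within-event and held-out-event scores would be consistent with the event-specific feature distributions the abstract describes.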

    Sensitive Detection of Viral Transcripts in Human Tumor Transcriptomes

    In excess of 12% of human cancer incidents have a viral cofactor. Epidemiological studies of idiopathic human cancers indicate that additional tumor viruses remain to be discovered. Recent advances in sequencing technology have enabled systematic screenings of human tumor transcriptomes for viral transcripts. However, technical problems such as low abundances of viral transcripts in large volumes of sequencing data, viral sequence divergence, and homology between viral and human factors significantly confound identification of tumor viruses. We have developed a novel computational approach for detecting viral transcripts in human cancers that takes the aforementioned confounding factors into account and is applicable to a wide variety of viruses and tumors. We apply the approach to conducting the first systematic search for viruses in neuroblastoma, the most common cancer in infancy. The diverse clinical progression of this disease as well as related epidemiological and virological findings are highly suggestive of a pathogenic cofactor. However, a viral etiology of neuroblastoma is currently contested. We mapped 14 transcriptomes of neuroblastoma as well as positive and negative controls to the human and all known viral genomes in order to detect both known and unknown viruses. Analysis of controls, comparisons with related methods, and statistical estimates demonstrate the high sensitivity of our approach. Detailed investigation of putative viral transcripts within neuroblastoma samples did not provide evidence for the existence of any known human viruses. Likewise, de-novo assembly and analysis of chimeric transcripts did not result in expression signatures associated with novel human pathogens. While confounding factors such as sample dilution or viral clearance in progressed tumors may mask viral cofactors in the data, in principle, this is rendered less likely by the high sensitivity of our approach and the number of biological replicates analyzed. 
Therefore, our results suggest that frequent viral cofactors of metastatic neuroblastoma are unlikely.

    Fenbendazole and triclabendazole effects on CYP1A1/1A2 and FMO1/3 mRNAs in cattle liver slices: preliminary results

    Background. Fenbendazole (FBZ) and triclabendazole (TCBZ) are benzimidazole drugs (BZDs) widely used in veterinary practice as anthelmintics. Members of the cytochrome P450 (CYP) and flavin monooxygenase (FMO) superfamilies of drug-metabolizing enzymes are primarily responsible for their biotransformation [1]. The xenobiotic-dependent up-regulation of CYPs is well documented, while FMOs are generally considered not inducible [2]. In the present study, the effects of FBZ and TCBZ (alone or in combination) on CYP1A1/2 and FMO1/3 mRNA levels were measured in precision-cut bovine liver slices (bLS). Methods. Precision-cut bLS from 6 male cattle were obtained according to Maté et al. [3]. Following the fine-tuning of an absolute quantification protocol for target genes [4], bLS were incubated for 0, 6 and 12 h with FBZ and TCBZ (50 µM), alone or in combination. β-naphthoflavone (βNF, 25 µM) was used as positive control to confirm gene induction. Target gene mRNA levels were measured by qPCR. The TATA Box binding protein and ribosomal protein lateral stalk subunit P0/β-actin were used as internal control genes to normalize FBZ/TCBZ and βNF data, respectively. Results. At T0, CYP1A2 mRNA levels were 25-fold higher than those of CYP1A1, whereas FMO1 and FMO3 were equally represented. βNF up-regulated CYP1A1 (P<0.05; 4-fold vs control) after 6 h of incubation, while increased amounts (P<0.05) of CYP1A2, FMO1 and FMO3 mRNAs were observed after 12 h (4-, 2.5- and 3.5-fold vs control, respectively). Concerning BZDs, FBZ increased CYP1A1/2 mRNAs (P<0.05) after 12 h of incubation, whereas TCBZ up-regulated only FMO3 (P<0.05), and did so after 6 h. No transcriptional effect was observed following bLS exposure to the BZD combination. Conclusion. As in humans, bLS appear to be a reliable model to study CYP1A induction by βNF and other potential aryl hydrocarbon receptor agonists. In addition, we show for the first time that βNF up-regulates FMO1/3 in cattle.
Regarding BZDs, FBZ and TCBZ were shown to affect CYP1A1/2 and FMO3 gene expression. Confirmatory studies on CYP1A and FMO catalytic activities are currently underway.
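The abstract reports fold changes against control after normalization to internal control genes, but does not give the calculation. As an illustration only, the sketch below uses the widely used 2^(-ΔΔCt) relative quantification formula, which assumes roughly 100% amplification efficiency for target and reference; the authors describe an absolute quantification protocol, so this is not necessarily their method, and the Ct values are made up.

```python
def fold_change(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    """Relative expression of a target gene, treated vs control, by 2^(-ddCt).

    Each Ct is first normalized to the internal control (reference) gene
    measured in the same sample; the treated and control deltas are then
    compared. Assumes equal, near-100% PCR efficiency for both genes.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier
# in treated slices, with an unchanged reference gene.
print(fold_change(22.0, 18.0, 24.0, 18.0))  # 4.0
```

A 2-cycle shift relative to the reference gene corresponds to a 4-fold change, the same order of magnitude as the βNF effects reported above.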

    Effects of fenbendazole and triclabendazole on the expression of cytochrome P450 1A and flavin-monooxygenase isozymes in bovine precision-cut liver slices

    Combinations of the anthelmintics fenbendazole (FBZ) and triclabendazole (TCBZ) have shown enhanced efficacy against the liver fluke Fasciola hepatica. This study aimed to measure the constitutive expression of CYP1A1, CYP1A2, FMO1 and FMO3, thought to be involved in the metabolism of those compounds, by using an absolute quantitative real time (RT)-PCR approach in bovine precision-cut liver slices (PCLS). It also aimed to characterize the effects of FBZ and TCBZ (alone and in combination) on the expression and activity of the aforementioned isozymes. FMO1 and FMO3 were equally represented in control PCLS, whereas CYP1A2 was expressed more than CYP1A1 (P<0.05). PCLS cultured in the presence of beta-naphthoflavone (β-NF; CYP1A inducer) had higher mRNA levels of CYP1A1, CYP1A2, FMO1 and FMO3 (P<0.05). No clear-cut evidence of transcriptional effects of the anthelmintics was recorded. After incubation of PCLS with FBZ, a significant increase (P<0.05) vs. controls and TCBZ was observed for CYP1A1. PCLS treated with FBZ showed a higher (P<0.05) expression of CYP1A2 compared to controls, TCBZ alone, and the combination FBZ+TCBZ. The gene expression profiles of FMO1 and FMO3 were not affected by the presence of the anthelmintics; the only exception was an upregulation of FMO3 by TCBZ alone. The observed transcriptional effects of the xenobiotics were not mirrored by increased enzyme activities using prototypical substrates of the isozymes under study. Although further confirmatory studies are needed, these results suggest that PCLS represent an alternative in vitro tool for studies on the expression, regulation and function of relevant xenobiotic-metabolizing enzymes in cattle.

    Gene expression of the CYP3A subfamily in bovine liver slices (Expresión genética de la subfamilia CYP3A en cortes laminares hepáticos de bovino)

    Various in vitro models allow the study of the expression and function of xenobiotic-metabolizing enzymes (XMEs). Among them, liver slices are an excellent model for characterizing the regulation of expression of the cytochrome P450 (CYP) enzyme system. Dexamethasone (DEX) is a steroidal anti-inflammatory drug that modulates expression of the CYP3A subfamily in different species. The aim of the present work was to study the expression level of the different CYP3A isoenzymes in bovine liver slices, and to study the effect of DEX on the gene expression of this subfamily. To this end, bovine liver slices were prepared using a Brendel/Vitron® microtome. The slices were incubated for 6 and 12 h in the absence (controls) and in the presence of DEX (0.1, 5 and 50 µM) in Williams' medium E, under an O2/CO2 (95/5) atmosphere. Liver tissue viability was determined by histopathology and by GSH and intracellular K+ content. Gene expression of CYP3A28, 38 and 48 was determined by absolute quantification with real-time PCR (qPCR), and the effects of incubation with DEX were measured by qPCR using RPLPO as the reference gene. Viability parameters were acceptable up to 12 h of incubation. Expression levels, in increasing order, were CYP3A28 < 3A48 < 3A38. DEX did not induce changes in the expression levels of any of the CYP3A isoforms under study. These results show that, although the literature posits DEX as an inducer of the CYP3A subfamily in murine models and in primary cultures of human hepatocytes, this would not be the case in the bovine species. The gene expression patterns of XMEs in ruminants remain under study in our laboratory.
    Affiliations. Maté, María Laura; Viviani, Paula; Ballent, Mariana; Lifschitz, Adrian Luis; Virkel, Guillermo Leon: Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - Tandil, Centro de Investigación Veterinaria de Tandil; Universidad Nacional del Centro de la Provincia de Buenos Aires, Centro de Investigación Veterinaria de Tandil; Provincia de Buenos Aires, Gobernación, Comision de Investigaciones Científicas, Centro de Investigación Veterinaria de Tandil; Argentina. Giantin, Mery; Tolosi, Roberta: Università di Padova; Italy. Dacasto, Mauro: not specified.
    Meeting. I Reunión Conjunta: 5° Reunión Internacional de Ciencias Farmacéuticas y la 50° Reunión Anual de la Sociedad Argentina de Farmacología Experimental; San Luis, Argentina; Sociedad Argentina de Farmacología Experimental; Universidad Nacional de San Luis.

    Mapping rates.

    Mapping ratios and depths of neuroblastoma (NB), positive control (POS), and negative control (NEG) panels. Mapped reads are relative to the number of sequenced read pairs that have passed quality control. Depths include reads with multiple mapping locations (‘multimaps’).

    Virana's approach to identifying viral transcripts in human tumors.

    a) Transcriptome sequence samples are first mapped to a combined set of human and viral reference sequences in a splicing-aware fashion. b) Unmapped or discordantly mapped read pairs are further processed by assembly methods to detect novel viruses or transcript chimeras that may indicate proviral integration events. c) Reads mapping to one or more viral genomes (HITs) are analyzed in an integrated fashion by considering human homologous mapping locations and viral taxonomies. This process results in a number of homologous regions (HOR) for each viral family. HORs are represented as multiple sequence alignments incorporating a wealth of sequence information. Alignments are further enriched by taxonomic annotations and phylogenetic analyses.
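The routing described in steps a) to c) can be sketched as a simple triage over mapping results. This is only an illustration under assumptions, not Virana's implementation: the input dictionary stands in for real alignment output, discordantly mapped pairs are not modeled (only unmapped reads are sent to assembly), and all names, including `triage_reads`, are hypothetical.

```python
from collections import defaultdict


def triage_reads(alignments, viral_family):
    """Split reads into assembly candidates and per-family viral hits.

    alignments:   read id -> list of reference names the read maps to
                  (an empty list means the read is unmapped).
    viral_family: viral reference name -> viral family; any reference not
                  listed here is treated as human.
    Returns (to_assembly, hits), where hits maps each viral family to a list
    of (read_id, has_human_homolog) pairs.
    """
    to_assembly = []
    hits = defaultdict(list)
    for read_id, refs in alignments.items():
        if not refs:
            # Step b): unmapped reads are candidates for de-novo assembly.
            to_assembly.append(read_id)
            continue
        viral = [r for r in refs if r in viral_family]
        if viral:
            # Step c): record the hit once per family and flag reads that
            # also map to a human location (possible homology).
            has_human_homolog = len(viral) < len(refs)
            for family in sorted({viral_family[r] for r in viral}):
                hits[family].append((read_id, has_human_homolog))
    return to_assembly, dict(hits)


# Toy input with illustrative reference names.
alignments = {"r1": ["chr1"], "r2": ["HPV16"], "r3": ["HPV16", "chr8"], "r4": []}
families = {"HPV16": "Papillomaviridae"}
print(triage_reads(alignments, families))
```

Grouping hits by family rather than by individual genome mirrors the caption's per-family homologous regions, and the homolog flag marks the reads for which human-viral homology could confound a call.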

    Sequencing panel characteristics.

    Sequencing characteristics of neuroblastoma (NB), positive control (POS), and negative control (NEG) panels.